Abstract

Motivated by the philosophy and phenomenal success of compressed sensing, the problem of
reconstructing a matrix from a sampling of its entries has attracted much attention recently.
Such a problem can be viewed as an information-theoretic variant of the well-studied matrix
completion problem, and the main objective is to design an efficient algorithm that can
reconstruct a matrix by inspecting only a small number of its entries. Although this is an
impossible task in general, Candes and co-authors have recently shown that under a so-called
incoherence assumption, a rank r $n \times n$ matrix can be reconstructed using semidefinite
programming (SDP) after one inspects $O(nr \log^6 n)$ of its entries. In this paper we propose
an alternative approach that is much more efficient and can reconstruct a larger class of
matrices by inspecting a significantly smaller number of the entries. Specifically, we first
introduce a class of so-called stable matrices and show that it includes all those that satisfy
the incoherence assumption. Then, we propose a randomized basis pursuit (RBP) algorithm and
show that it can reconstruct a stable rank r $n \times n$ matrix after inspecting $O(nr \log n)$
of its entries. Our sampling bound is only a logarithmic factor away from the
information-theoretic limit and is essentially optimal. Moreover, the runtime of the RBP
algorithm is bounded by $O(nr^2 \log n + n^2 r)$, which compares very favorably with the
$\Omega(n^4 r^2 \log^{12} n)$ runtime of the SDP-based algorithm. Perhaps more importantly, our
algorithm will provide an exact reconstruction of the input matrix in polynomial time. By
contrast, the SDP-based algorithm can only provide an approximate one in polynomial time.
1 Introduction

A fundamental problem that arises frequently in many disciplines is that of reconstructing a 
matrix with certain properties from some partial information. Typically, such a problem is 
motivated by the desire to deduce global structure from a (small) number of local observations. 
For instance, consider the following applications: 

• Covariance Estimation. In areas such as statistics, machine learning and wireless
communications, it is often of interest to find the maximum likelihood estimate of the covariance
matrix $\Sigma \in \mathbb{C}^{m \times m}$ of a random vector $v \in \mathbb{C}^m$. Such an estimate can be used to study
the relationship among the variables in v, or to give some indication of the performance
of certain systems. Usually, extra information is available to facilitate the estimation. For
instance, we may have a number of independent samples that are drawn according to the
distribution of v, as well as some structural constraints on $\Sigma$ (e.g., certain entries of $\Sigma^{-1}$
have prescribed values [PJ, the matrix $\Sigma$ has a Toeplitz structure and some of its entries
have prescribed values [UJ, etc.). Thus, the estimation problem becomes that of completing
a partially specified matrix so that the completion satisfies the structural constraints
and maximizes a certain likelihood function.

• Graph Realization. It is a trivial matter to see that given the coordinates of n points in
$\mathbb{R}^k$, the distance between any two points can be computed efficiently. However, the inverse
problem (given a subset of interpoint distances, find the coordinates of points, called a
realization, in $\mathbb{R}^k$, where $k \ge 1$ is fixed, that fit those distances) turns out to be anything
but trivial (see, e.g., [331 E3 El]). Such a problem arises in many different contexts, such
as sensor network localization (see, e.g., [H [36]) and molecular conformation (see, e.g.,
[TS[ [8]), and is equivalent to the problem of completing a partially specified matrix to a
Euclidean distance matrix that has a certain rank (cf. [2H 125]).

• Recovering Structure from Motion. A fundamental problem in computer vision 
is to reconstruct the structure of an object by analyzing its motion over time. This 
problem, which is known as the Structure from Motion (SfM) Problem in the literature, 
can be formulated as that of finding a low-rank approximation to a certain measurement
matrix (see, e.g., [7]). However, due to the presence of occlusion or tracking failures,
the measurement matrix often has missing entries. When one takes into account such 
difficulties, the reconstruction problem becomes that of completing a partially specified 
matrix to one that has a certain rank (see, e.g., [7]). 

• Recommendation Systems. Although electronic commerce has offered great convenience
to customers and merchants alike, it has complicated the task of tracking and
predicting customers' preferences. To cope with this problem, various recommendation
systems have been developed over the years (see, e.g., [15], [32], [10]). Roughly speaking,
those systems maintain a matrix of preferences, where the rows correspond to users and
columns correspond to items. When a user purchases or browses a subset of the items,
she can specify her preferences for those items, and those preferences will then be recorded
in the corresponding entries of the matrix. Naturally, if a user has not considered a
particular item, then the corresponding entry of the matrix will remain unspecified. Now,
in order to predict users' preferences for the unseen items, one will have to complete a
partially specified matrix so that the completion maximizes a certain performance measure
(such as each individual's utility [23]).

Note that in all the examples above, we are forced to take whatever information is given to us. 
In particular, we cannot, for instance, specify which entries of the unknown matrix to examine. 
As a result, the reconstruction problem can be ill-posed (e.g., there may not be a unique or even 
any solution that satisfies the given criteria). This is indeed an important issue. However, we 
shall not address it in this paper (see, e.g., [201 [22 EEH1 [25] for related work). Instead, we take a 
different approach and consider the information-theoretic aspects of the reconstruction problem. 
Specifically, let $A \in \mathbb{R}^{m \times n}$ be the rank r matrix that we wish to reconstruct. For the sake of
simplicity, suppose that r is known. Initially, no information about A (other than its rank) is
available. However, we are allowed to inspect any entry of A and inspect as many entries as we
desire in order to complete the reconstruction. Of course, if we inspect all $mn$ entries of A, then
we will be able to reconstruct A exactly. Thus, it is natural to ask whether we can inspect only a 
small number of entries and still be able to reconstruct A in an efficient manner. Besides being 
a theoretical curiosity, such a problem does arise in practical applications. For instance, in the 
sensor network localization setting [36], the aforementioned problem is tantamount to asking 
which of the pairwise distances are needed in order to guarantee a successful reconstruction of 
the network topology. It turns out that if the number of required pairwise distances is small, 
then we will be able to efficiently reconstruct the network topology by performing just a few 
distance measurements and solving a small semidefinite program (SDP) [38].




To get an idea of what we should aim for, let us first determine the degrees of freedom
available in specifying the rank r matrix $A \in \mathbb{R}^{m \times n}$. This will give us a lower bound on the
number of entries of A we need to inspect in order to guarantee an exact reconstruction. Towards
that end, consider the singular value decomposition (SVD) $A = U \Sigma V^T$, where $U \in \mathbb{R}^{m \times r}$ and
$V \in \mathbb{R}^{n \times r}$ have orthonormal columns, and $\Sigma \in \mathbb{R}^{r \times r}$ is a diagonal matrix. Clearly, there are
r degrees of freedom in specifying $\Sigma$. Now, observe that for $i = 1, 2, \ldots, r$, the i-th column of
U must be orthogonal to all of the previous $i - 1$ columns, and that it must have unit length.
Thus, there are $m - i$ degrees of freedom in specifying the i-th column of U, which implies that
there are $\sum_{i=1}^{r} (m - i) = r(2m - r - 1)/2$ degrees of freedom in specifying U. By the same
argument, there are $\sum_{i=1}^{r} (n - i) = r(2n - r - 1)/2$ degrees of freedom in specifying V. Hence,
we have:

$$\Delta = r + \frac{r(2m - r - 1)}{2} + \frac{r(2n - r - 1)}{2} = r(m + n - r)$$

degrees of freedom in specifying the matrix A. In particular, this implies that we need to inspect
at least $\Delta$ entries of A, for otherwise there will be infinitely many matrices that are consistent
with the observations, and we will not be able to reconstruct A exactly.
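
As a quick sanity check on this count, the following minimal Python sketch tallies the degrees of freedom factor by factor (r for the diagonal factor, m - i and n - i for the i-th columns of U and V) and compares the total with the closed form r(m + n - r) that the three contributions sum to. The function name `degrees_of_freedom` is ours and purely illustrative.

```python
def degrees_of_freedom(m: int, n: int, r: int) -> int:
    """Count the free parameters of a rank-r m x n matrix via its SVD."""
    if not 1 <= r <= min(m, n):
        raise ValueError("need 1 <= r <= min(m, n)")
    sigma = r                                  # diagonal entries of Sigma
    u = sum(m - i for i in range(1, r + 1))    # orthonormal columns of U
    v = sum(n - i for i in range(1, r + 1))    # orthonormal columns of V
    return sigma + u + v


# The column-by-column count agrees with the closed form r(m + n - r).
for m, n, r in [(5, 7, 2), (100, 100, 3), (4, 4, 4)]:
    assert degrees_of_freedom(m, n, r) == r * (m + n - r)

# A rank-10 matrix of size 1000 x 1000 has far fewer degrees of freedom
# than it has entries.
print(degrees_of_freedom(1000, 1000, 10), "versus", 1000 * 1000)
```

For low rank the count grows like r(m + n), which is why sampling bounds of order Nr (up to logarithmic factors) are regarded as nearly optimal.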

Now, a natural question arises whether it is possible to reconstruct A by inspecting just
$O(\Delta)$ of its entries, i.e., a number on the order of the degrees of freedom. A moment of
thought reveals that the answer is no, as the information that is crucial to the reconstruction
of A may be concentrated in only a few entries. For instance,
consider the rank one $n \times n$ matrix $A = e_1 e_1^T$, where $e_1 = (1, 0, \ldots, 0) \in \mathbb{R}^n$. This is a matrix
with only one non-zero entry, and it is clear that if we do not inspect that entry, then there is
no way we can reconstruct A exactly.

From the above example, we see that our ability to reconstruct A depends not only on
the number of entries we inspect, but also on which entries we inspect and on the structure of
A. This motivates the following question: are there matrices for which exact reconstruction is
possible after inspecting only $O(\Delta)$ of the entries? More generally, is there any tradeoff between
the "niceness" of the structure of A and the number of entries we need to inspect in order to
reconstruct A?

1.1 Related Work 

In a recent work [5], Candes and Recht studied the above questions and proposed a solution 
that is based on ideas from compressed sensing and convex optimization. They first defined a 
notion called coherence, which can be viewed as a measure of the niceness of a matrix and is 
motivated by a similar notion in the compressed sensing literature [3]. Informally, a matrix has
low coherence if the information that is crucial to its reconstruction is well-spread (cf. the case
where $A = e_1 e_1^T$). Then, they proposed the following algorithm for reconstructing any $m \times n$
matrix A:

The Candes-Recht Algorithm

1. Let T be a uniformly random subset of $\{1, \ldots, m\} \times \{1, \ldots, n\}$ with given cardinality
$|T| \ge 1$. Inspect the (i, j)-th entry of A if $(i, j) \in T$, thus obtaining a set of values
$\{A_{ij} : (i, j) \in T\}$.

2. Output an optimal solution to the following optimization problem:

$$
\begin{array}{ll}
\text{minimize} & \|X\|_* \\
\text{subject to} & X_{ij} = A_{ij} \quad \text{for } (i, j) \in T \\
& X \in \mathbb{R}^{m \times n}
\end{array}
\qquad (1)
$$

Here, $\|X\|_*$ is the so-called nuclear norm of X and is defined as the sum of all the singular
values of X.

Candes and Recht showed that if A has low coherence, then whenever $|T| = \Omega(N^{5/4} r \log N)$,
where $N = \max\{m, n\}$ and $r = \mathrm{rank}(A)$, the solution to problem (1) will be unique and equal
to A with high probability. In other words, by inspecting $O(N^{5/4} r \log N)$ randomly chosen
entries of A and then solving the optimization problem (1), one can reconstruct A exactly with
high probability. Note that problem (1) can be formulated as an SDP; see, e.g., [13, Chapter
5]. As such, it can be solved to any desired accuracy in polynomial time (see, e.g., [HI [37]).
However, if one uses standard SDP solvers, then the runtime of the Candes-Recht algorithm is
at least on the order of $\max\{N^{9/2} r^2 \log^2 N, N^{15/4} r^3 \log^3 N\}$ (see, e.g., [37] [SB]), which severely
limits its use in practice. Although specialized algorithms are being developed to solve the SDP
associated with problem (1) more efficiently (see, e.g., [21 [T7J, ES, EE]), they either do not have
any theoretical time bound, or their runtimes can still be prohibitively high when N is large (at
least on the order of $N^{9/2} r^2 \log N$).

Subsequent to the work of Candes and Recht, improvements have been made by various 
researchers on both the sampling and runtime bounds for the problem. In [22], Keshavan 
et al. proposed a reconstruction algorithm that is based on the SVD and a certain manifold 
optimization procedure. They showed that if the input matrix A has low coherence and low 
rank, then by sampling $|T| = \Omega(Nr \max\{\log N, r\})$ entries of A uniformly at random, their
algorithm will produce a sequence of iterates that converges to A with high probability. Note
that the sampling complexity of Keshavan et al.'s algorithm is just a polylogarithmic factor away
from the information-theoretic minimum $\Delta$ and hence is almost optimal. However, their result
applies only when the rank of A is bounded above by $N^{1/2}$, and the ratio between the largest
and smallest singular values is bounded. Moreover, there is no theoretical time bound for their
algorithm. Around the same time, Candes and Tao [6] refined the analysis in [5] and showed that
the sampling complexity of the Candes-Recht algorithm can be reduced to $|T| = \Omega(Nr \log^6 N)$
when the input matrix A has low coherence (but not necessarily low rank). Again, this is just a
polylogarithmic factor away from the information-theoretic minimum $\Delta$. However, the runtime
of the algorithm remains high (at least on the order of $N^4 r^2 \log^{12} N$).

1.2 Our Contribution 

From the above discussion, we see that it is desirable to design a reconstruction algorithm 
that can work for a large class of matrices and yet still has low sampling and computational 
complexities. In this paper we make a step towards that goal. Specifically, our contribution is 
twofold. First, we introduce the notion of k-stability, which again can be viewed as a measure of
the niceness of a matrix. Roughly speaking, an $m \times n$ rank r matrix A is k-stable if every $m \times (n - k)$
sub-matrix of A has rank r, but some $m \times (n - k - 1)$ sub-matrix of A has rank $r - 1$. Intuitively,
if a low-rank matrix has high stability (i.e., when k is large), then the information that is crucial
to its reconstruction is present in many small subsets of the columns, and hence the matrix
should be more amenable to exact reconstruction. As it turns out, the notion of k-stability is
related to the so-called Maximum Distance Separable (MDS) codes in coding theory [23, Chapter
11]. Moreover, from the above informal definition, we see that k-stability is a combinatorial
property of matrices, which should be contrasted with the more analytic nature of the notion of
coherence as defined in [5]. Nevertheless, there is a strong connection between those two notions.
More precisely, we show that if a matrix has low coherence, then it must have high stability.
Such a connection opens up the possibility of comparing our results to those in [5], [22], [6].

Secondly, we propose a randomized basis pursuit (RBP) algorithm for the reconstruction 
problem. Our algorithm differs from that of Candes and Recht [5] and Keshavan et al. [22] in 
two major aspects: 

1. We do not sample the matrix entries in a uniform fashion. Instead, we sample the columns 
(or rows) of the matrix uniformly. We note that such a sampling strategy is reminiscent 
of that used for constructing low-rank approximations to a given matrix [11], [12], [31].
However, there is one crucial difference, namely, our sampling strategy does not require
any knowledge of the input matrix. By contrast, the strategy used in [11], [12], [31] assumes
that the norm of each column of the input matrix is known.

2. Our algorithm does not involve any optimization procedure and will produce an exact 
solution in polynomial time. This should be contrasted with the SDP-based algorithm of 
Candes and Recht, which can only return an approximate solution in polynomial time (see 
[30] for discussions on the complexity of solving SDPs); and with the spectral method of 
Keshavan et al., which is known to converge to an exact solution but has no theoretical 
time bound. 

Regarding the performance of our algorithm, we show that if the input matrix A has high 
stability (in particular, this includes the case where A has low coherence), then by sampling 
$O(Nr \log N)$ entries of A using our column sampling procedure, we can reconstruct A exactly
with high probability. Furthermore, we show that the runtime of our algorithm is bounded
above by $O(Nr^2 \log N + N^2 r)$. Thus, on both the sampling and computational complexities,
our bounds yield substantial improvement over those in [22], [6]. Moreover, our sampling bound
is essentially optimal, as the extra $\log N$ factor can be attributed to the coupon collecting
phenomenon [2EJ Chapter 3] (see [5], [22], [6] for related discussions).

1.3 Outline of the Paper 

In Section 2 we first introduce the notion of a k-stable matrix and derive some of its properties.
Then, we study the relationship between the notion of k-stability and the notion of coherence
defined in [5]. Afterwards, we study some constructions of k-stable matrices and show that
they are in fact quite ubiquitous. In Section 3 we propose a randomized basis pursuit (RBP)
algorithm for the matrix reconstruction problem and analyze its sampling and computational
complexities. Although the RBP algorithm assumes that the rank of the input matrix is known,
we show how such an assumption can be removed in Section 3.3. Finally, we summarize our
results and discuss some possible future directions in the final section.

2 The Class of k-Stable Matrices

As mentioned in the Introduction, our ability to reconstruct a matrix depends in part on its
structure. In this paper we shall focus on the class of k-stable matrices, which is defined as
follows:

Definition 1 A rank r matrix $A \in \mathbb{R}^{m \times n}$ is said to be k-stable for some $k \in \{0, 1, \ldots, n - r\}$
if

1. every $m \times (n - k)$ sub-matrix of A has rank r; and

2. there exists an $m \times (n - k - 1)$ sub-matrix of A with rank equal to $r - 1$.

In other words, the rank of a k-stable matrix A remains unchanged under the removal of any
k of its columns. We use $\mathcal{M}^{m \times n}(k, r) \subset \mathbb{R}^{m \times n}$ to denote the set of all k-stable rank r $m \times n$
matrices, and use $\mathcal{M}^{m \times n}(k) \subset \mathbb{R}^{m \times n}$ to denote the set of all k-stable $m \times n$ matrices.

Note that the notion of k-stability is defined with respect to the columns of a matrix. Of course,
we may also define it with respect to the rows. However, unlike the notions of row rank and
column rank, which are equivalent, a column k-stable matrix may not be row k-stable.
Unless otherwise stated, we shall refer to a column k-stable matrix simply as a k-stable matrix
in the sequel.

As we shall see, the notion of k-stability has many nice properties. For instance, it generalizes
the notions of coherence defined in [5], [6]. Moreover, a matrix with high stability (i.e., $k = \Theta(n)$)
can be reconstructed by a simple and efficient randomized algorithm with high probability.
Before we give the details of these results, let us first take a look at some (deterministic)
constructions of k-stable matrices.

Example 1 Let $a = (a_1, \ldots, a_n) \in \mathbb{R}^n$ be any vector with no zero component. Consider the
$m \times n$ matrix A whose first row is equal to $a^T$ and all other entries are zeroes. Then, A is an
$(n - 1)$-stable rank one matrix.

Example 2 Let $n > 1$ be an odd integer. Let $e \in \mathbb{R}^n$ be the vector of all ones, and let

$$u = \left( -\frac{n-1}{2}, \; -\frac{n-1}{2} + 1, \; -\frac{n-1}{2} + 2, \; \ldots, \; \frac{n-1}{2} \right) \in \mathbb{R}^n$$

Consider the $n \times n$ matrix A whose first row is equal to $e^T$ and whose i-th row is equal to $u^T$, where
$i = 2, \ldots, n$. It is then easy to verify that A is an $(n - 2)$-stable rank two matrix.

Example 3 Let m, n be integers with $n \ge m \ge 1$. Suppose that $u = (u_1, \ldots, u_m) \in \mathbb{R}^m$ and
$v = (v_1, \ldots, v_n) \in \mathbb{R}^n$ are given. Consider the $m \times n$ matrix A defined by $A_{ij} = (u_i + v_j)^{-1}$,
where $1 \le i \le m$ and $1 \le j \le n$. The matrix A is known as a Cauchy matrix. It is well-known
that if the $u_i$'s are distinct, the $v_j$'s are distinct, and $u_i + v_j \ne 0$ for all $1 \le i \le m$ and
$1 \le j \le n$, then every square sub-matrix of A is non-singular. In particular, this implies that
A is an $(n - m)$-stable rank m matrix.
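
For small matrices, Definition 1 can be checked by brute force. The sketch below is ours and is not part of the reconstruction algorithm: it computes exact ranks over the rationals and returns the largest k such that deleting any k columns preserves the rank, which coincides with the stability in Definition 1. The helper names `rank` and `column_stability` and the concrete sizes are illustrative assumptions, and in Example 2 we assume that u consists of consecutive integers centred at zero (any vector with distinct entries behaves the same way). The cost is exponential in n, so this is only meant for verifying toy instances such as the three examples above.

```python
from fractions import Fraction
from itertools import combinations


def rank(rows):
    """Exact rank of a matrix (given as a list of rows) by Gaussian elimination."""
    mat = [[Fraction(x) for x in row] for row in rows]
    rk = 0
    for col in range(len(mat[0]) if mat else 0):
        piv = next((i for i in range(rk, len(mat)) if mat[i][col] != 0), None)
        if piv is None:
            continue
        mat[rk], mat[piv] = mat[piv], mat[rk]
        for i in range(rk + 1, len(mat)):
            f = mat[i][col] / mat[rk][col]
            mat[i] = [a - f * b for a, b in zip(mat[i], mat[rk])]
        rk += 1
    return rk


def column_stability(rows):
    """Return (r, k): the rank r and the column stability k of a nonzero matrix."""
    n = len(rows[0])
    r = rank(rows)
    k = 0
    while k < n - r:
        # Does every way of deleting k + 1 columns still leave rank r?
        if not all(
            rank([[row[j] for j in keep] for row in rows]) == r
            for keep in combinations(range(n), n - k - 1)
        ):
            break
        k += 1
    return r, k


# Example 1 with m = 3, n = 4: first row a^T, zeros elsewhere.
example1 = [[1, 2, 3, 4], [0, 0, 0, 0], [0, 0, 0, 0]]

# Example 2 with n = 5: first row all ones, remaining rows equal to u^T.
n = 5
u = [i - (n - 1) // 2 for i in range(n)]
example2 = [[1] * n] + [u[:] for _ in range(n - 1)]

# Example 3 with m = 2, n = 4: a Cauchy matrix with entries 1 / (u_i + v_j).
cauchy = [[Fraction(1, ui + vj) for vj in (1, 2, 3, 4)] for ui in (1, 2)]

assert column_stability(example1) == (1, 3)   # (n - 1)-stable, rank one
assert column_stability(example2) == (2, 3)   # (n - 2)-stable, rank two
assert column_stability(cauchy) == (2, 2)     # (n - m)-stable, rank m
```

By contrast, the matrix $e_1 e_1^T$ from the Introduction is only 0-stable, since deleting its first column already destroys the rank.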



2.1 Relation to the Notion of Coherence 

In all previous work on the matrix reconstruction problem [5], [22], [6], the notion of coherence
is used to measure the niceness of a matrix. This immediately raises the question of whether
k-stability and coherence are comparable notions. It turns out that the former can be viewed as
a generalization of the latter. Before we formalize this statement, let us first recall the definition
of coherence [5]:

Definition 2 Let $U \subset \mathbb{R}^n$ be a subspace of dimension $r \ge 1$, and let $P_U$ be the orthogonal
projection onto U. Then, the coherence of U is defined as:

$$\mu(U) = \frac{n}{r} \max_{1 \le i \le n} \|P_U e_i\|_2^2$$

where $e_i \in \mathbb{R}^n$ is the i-th standard basis vector, for $i = 1, \ldots, n$.

A simple consequence of this definition is the following:

Proposition 1 Let $U \in \mathbb{R}^{n \times r}$ be a matrix with orthonormal columns. When viewed as a
subspace of $\mathbb{R}^n$ (i.e., the subspace spanned by the columns of U), the coherence of U is given by:

$$\mu(U) = \frac{n}{r} \max_{1 \le i \le n} \|u_i\|_2^2$$

where $u_i$ is the i-th row of U, for $i = 1, \ldots, n$.

Proof Let $v_1, \ldots, v_r \in \mathbb{R}^n$ be the columns of U. Since U has orthonormal columns, we have:

$$P_U e_i = \sum_{j=1}^{r} (e_i^T v_j) v_j \quad \text{for } i = 1, \ldots, n$$

whence:

$$\|P_U e_i\|_2^2 = \sum_{j=1}^{r} (e_i^T v_j)^2 = \|u_i\|_2^2$$

as desired. □
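
Proposition 1 is convenient computationally, because it replaces the projection in Definition 2 by row norms. The following small Python sketch, written with plain lists so that it is self-contained, evaluates the coherence both ways on two toy matrices with orthonormal columns. The function names and the two test matrices are our own illustrative choices.

```python
import math


def coherence(U):
    """Proposition 1: mu(U) = (n / r) * max_i ||u_i||^2, with u_i the rows of U."""
    n, r = len(U), len(U[0])
    return n / r * max(sum(x * x for x in row) for row in U)


def coherence_via_projection(U):
    """Definition 2: mu(U) = (n / r) * max_i ||P_U e_i||^2, with P_U = U U^T."""
    n, r = len(U), len(U[0])
    worst = 0.0
    for i in range(n):
        # P_U e_i = sum_j (e_i^T v_j) v_j, where v_j is the j-th column of U.
        proj = [sum(U[i][j] * U[t][j] for j in range(r)) for t in range(n)]
        worst = max(worst, sum(x * x for x in proj))
    return n / r * worst


# Well-spread subspace: the columns are (1, 1, 1, 1)/2 and (1, -1, 1, -1)/2.
U_flat = [[0.5, 0.5], [0.5, -0.5], [0.5, 0.5], [0.5, -0.5]]
# Fully concentrated subspace: the span of e_1 in R^4.
U_spiky = [[1.0], [0.0], [0.0], [0.0]]

for U in (U_flat, U_spiky):
    assert math.isclose(coherence(U), coherence_via_projection(U))

print(coherence(U_flat), coherence(U_spiky))  # smallest (1) and largest (n = 4) values
```

The two extremes illustrate the range of the definition: the coherence is 1 when all rows of U have equal norm, and n when the subspace is spanned by a single standard basis vector.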

The following invariance property of k-stable matrices will be useful for establishing the
relationship between k-stability and coherence:

Proposition 2 Let $A \in \mathbb{R}^{m \times n}$ be an arbitrary matrix, and let $U \in \mathbb{R}^{p \times m}$ be a matrix with
linearly independent columns (in particular, we have $p \ge m$). Then, for any $k \in \{0, 1, \ldots, n - r\}$,
A is a k-stable rank r matrix iff UA is so.

Proof Let $a_1, \ldots, a_n \in \mathbb{R}^m$ be the columns of A. Then, the columns of UA are given by
$Ua_1, \ldots, Ua_n \in \mathbb{R}^p$. Now, for any $l = 1, \ldots, n$, the number of linearly independent vectors in
the collection $\{a_{i_1}, \ldots, a_{i_l}\}$ is the same as that in the collection $\{Ua_{i_1}, \ldots, Ua_{i_l}\}$, since U has
full column rank. This completes the proof. □

We are now ready to state our first main result: 

Theorem 1 Let $A \in \mathbb{R}^{m \times n}$ be a rank r matrix whose SVD is given by $A = U \Sigma V^T$, where
$U \in \mathbb{R}^{m \times r}$, $V \in \mathbb{R}^{n \times r}$ and $\Sigma \in \mathbb{R}^{r \times r}$. For any non-negative integer $k \le n - r$, if the coherence
of V satisfies $\mu(V) \le \mu_0$ for some $\mu_0 \in (0, \frac{n}{rk})$, then A is column s-stable for some $s \ge k$.
Similarly, for any non-negative integer $k' \le m - r$, if $\mu(U) \le \mu_0$ for some $\mu_0 \in (0, \frac{m}{rk'})$, then A
is row s-stable for some $s \ge k'$.

Proof By Proposition 2, it suffices to show that $V^T$ is a column s-stable rank r matrix for some
$s \ge k$, since $U \Sigma \in \mathbb{R}^{m \times r}$ is a matrix with linearly independent columns. Now, consider the
following cases:

Case 1: $r = 1$

Let $V = (v_1, \ldots, v_n) \in \mathbb{R}^n$. Then, by Proposition 1 and the fact that $\mu(V) \le \mu_0$, we have:

$$\mu(V) = n \max_{1 \le i \le n} v_i^2 \le \mu_0 < \frac{n}{k}$$

It follows that $v_i^2 < 1/k$ for $i = 1, \ldots, n$. Since $\sum_{i=1}^{n} v_i^2 = 1$, we conclude that V must have at
least $k + 1$ non-zero entries. It follows that $V^T$ is a column s-stable rank one matrix for some
$s \ge k$, since the removal of any k columns from $V^T$ does not change its rank.

Case 2: $r \ge 2$

Suppose that $V^T$ is only an l-stable rank r matrix for some $0 \le l \le k - 1$. Then, by definition,
there exist $l + 1$ columns whose removal will result in a rank $r - 1$ sub-matrix of $V^T$. Without
loss of generality, suppose that those $l + 1$ columns are the last $l + 1$ columns of $V^T$. Then,
we may write $V^T = [RQ \;\; N]$, where $R \in \mathbb{R}^{r \times (r-1)}$, $Q \in \mathbb{R}^{(r-1) \times (n-l-1)}$, $N \in \mathbb{R}^{r \times (l+1)}$, and Q has
orthonormal rows. Since $V^T$ has orthonormal rows, we have:

$$I_r = V^T V = RR^T + NN^T = [R \;\; N][R \;\; N]^T$$

which means that the matrix $[R \;\; N] \in \mathbb{R}^{r \times (r+l)}$ also has orthonormal rows. In particular, we
have $\|R\|_F^2 \le r - 1$ (here, $\|\cdot\|_F$ is the Frobenius norm). Moreover, since $\|[R \;\; N]\|_F^2 = r$, we
have:

$$\|N\|_F^2 = \|[R \;\; N]\|_F^2 - \|R\|_F^2 \ge 1 \qquad (2)$$

On the other hand, since each column of N is a row of V, Proposition 1 implies that:

$$\|N\|_F^2 \le (l + 1) \cdot \frac{r \mu_0}{n} < \frac{l + 1}{k} \le 1$$

which contradicts (2). Thus, we conclude that $V^T$ (and hence A) is a column s-stable rank r
matrix for some $s \ge k$.

The statement about the row stability of A can be established by considering $A^T = V \Sigma U^T$
and following the above argument. □

One of the consequences of Theorem 1 is that if both the factors U and V in the SVD of A have
small coherence relative to $\min\{m, n\}/r$ (which is the case of interest in the work [5], [22], [6]),
then A has high row and column stability. Now, a natural question arises whether the converse
also holds. Curiously, as the following proposition shows, the answer is no.

Proposition 3 Let $k \in \{1, \ldots, n - 1\}$ be arbitrary. Then, there exist $n \times n$ rank one matrices
A that are both row and column k-stable, and yet the corresponding SVDs $A = \sigma u v^T$ satisfy
$\min\{\mu(u), \mu(v)\} = \Theta(n)$.

Proof Let $\epsilon \in (0, 1/2)$ be fixed. Define $u \in \mathbb{R}^n$ as follows:

$$u_i = \begin{cases} \sqrt{1 - \epsilon} & \text{if } i = 1 \\ \sqrt{\epsilon / k} & \text{if } 2 \le i \le k + 1 \\ 0 & \text{otherwise} \end{cases}$$

By construction, we have $\|u\|_2 = 1$. Now, consider the rank one matrix $A = uu^T$. Since $u^T$ has
$k + 1$ non-zero entries, it is column k-stable. Thus, by Proposition 2 and the symmetry of A,
we conclude that A is both row and column k-stable. On the other hand, using Proposition 1,
we compute:

$$\mu(u) = n \max_{1 \le i \le n} u_i^2 = (1 - \epsilon) n$$

This completes the proof. □
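
The construction in this proof is easy to reproduce numerically. The sketch below builds the vector u for illustrative values of n, k and ε (our choices, not the paper's), and checks the three facts the proof relies on: u has unit norm, it has exactly k + 1 non-zero entries (so that $uu^T$ is k-stable), and its coherence equals $(1 - \epsilon) n$. The function name `spiky_unit_vector` is hypothetical.

```python
import math


def spiky_unit_vector(n, k, eps):
    """Unit vector with one dominant entry and k small non-zero entries."""
    if not (1 <= k <= n - 1 and 0 < eps < 0.5):
        raise ValueError("need 1 <= k <= n - 1 and 0 < eps < 1/2")
    return [math.sqrt(1 - eps)] + [math.sqrt(eps / k)] * k + [0.0] * (n - k - 1)


n, k, eps = 100, 10, 0.25
u = spiky_unit_vector(n, k, eps)

norm_sq = sum(x * x for x in u)
nonzeros = sum(1 for x in u if x != 0)
mu = n * max(x * x for x in u)   # Proposition 1 with r = 1

assert math.isclose(norm_sq, 1.0)
assert nonzeros == k + 1                 # removing any k entries leaves a non-zero one
assert math.isclose(mu, (1 - eps) * n)   # coherence is linear in n
```

Shrinking ε pushes the coherence towards its maximum value n while leaving the stability k untouched, which is exactly the sensitivity to entry values discussed next.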

As can be seen in the proof of Proposition 3, the coherence of a matrix can be very sensitive
to the actual values in its entries. This can be partly attributed to the fact that coherence is
an analytic notion. By contrast, the notion of k-stability is more combinatorial in nature and
hence is not as sensitive to those values.

Theorem 1 and Proposition 3 together show that the notion of k-stability can be regarded as
a generalization of the notions of coherence defined in [5], [6]. In particular, various constructions
of low-coherence matrices proposed in [5], [6] can be transferred to the high-stable case. However,
it would be nice to have some more direct constructions of high-stable matrices. In the next
section, we will show that matrices with high stability are actually quite ubiquitous.



2.2 Ubiquity of k-Stable Matrices

Let $A \in \mathbb{R}^{r \times n}$ be a matrix with full row rank (in particular, we have $r \le n$). Then, it is clear
that the maximum stability of A is $n - r$, and that the maximum can be attained. It turns out
that such a situation is typical. More precisely, we have the following:

Theorem 2 Let r, n be integers with $n \ge r \ge 1$. Then, the set $S = \mathbb{R}^{r \times n} \setminus \mathcal{M}^{r \times n}(n - r, r)$ has
Lebesgue measure zero when considered as a subset of $\mathbb{R}^{rn}$.

The proof of Theorem 2 relies on the following well-known result:

Proposition 4 Let $f : \mathbb{R}^l \to \mathbb{R}$ be a polynomial function that is not identically equal to zero.
Then, the solution set:

$$f^{-1}(0) = \{x \in \mathbb{R}^l : f(x) = 0\}$$

has Lebesgue measure zero.

A proof of Proposition 4 can be found in [29].

Proof of Theorem 2 Suppose that $A \in \mathbb{R}^{r \times n}$ is not $(n - r)$-stable. Then, one of the $r \times r$
sub-matrices of A must be singular, or equivalently, has determinant zero. Since the determinant of
a square matrix is a polynomial function of its entries, and since there are only finitely many
$r \times r$ sub-matrices of A, the set S is contained in a finite union of zero sets of polynomials that
are not identically zero. It then follows from Proposition 4 that S has Lebesgue measure zero. □

Thus, by taking a generic $r \times n$ matrix R and an arbitrary $m \times r$ matrix Q whose columns
are linearly independent, we may conclude from Proposition 2 and Theorem 2 that the $m \times n$
matrix $A = QR$ has rank r and is column $(n - r)$-stable.

In [5] the authors considered an alternative construction of rank r matrices using the so-called
random orthogonal model. In that model, one constructs an $m \times n$ matrix A via $A = U \Sigma V^T$,
where $V \in \mathbb{R}^{n \times n}$ is a random orthogonal matrix drawn according to the Haar measure on the
orthogonal group $O(n)$, $U \in \mathbb{R}^{m \times m}$ is an arbitrary orthogonal matrix, and $\Sigma \in \mathbb{R}^{m \times n}$ is an
arbitrary matrix with the partition:

$$\Sigma = \begin{bmatrix} \Sigma_r & 0 \\ 0 & 0 \end{bmatrix}, \qquad \Sigma_r \in \mathbb{R}^{r \times r} \text{ diagonal}, \quad (\Sigma_r)_{ii} \ne 0 \text{ for } i = 1, \ldots, r$$

By construction, the matrix A has rank r. Now, we claim that A is column $(n - r)$-stable
with probability one (with respect to the Haar measure on $O(n)$). To see this, observe that
$A = U_r \Sigma_r V_r^T$, where:

$$U = [\, U_r \;\; \bar{U}_r \,], \qquad V = [\, V_r \;\; \bar{V}_r \,]$$

and $U_r \in \mathbb{R}^{m \times r}$, $V_r \in \mathbb{R}^{n \times r}$. Since $U_r \Sigma_r$ has linearly independent columns, by Proposition
2 it suffices to show that $V_r^T$ is column $(n - r)$-stable with probability one. Towards that
end, it suffices to show that with probability one, every $r \times r$ sub-matrix of $V_r^T$ has non-zero
determinant. It turns out that the last statement is well-known; see, e.g., [21, Lemma 2.2].
Thus, we have proven the following:

Theorem 3 Let A be a rank r $m \times n$ matrix generated according to the random orthogonal
model. Then, A is column $(n - r)$-stable with probability one (with respect to the Haar measure
on $O(n)$).

In [5] it is shown that the coherence of a rank r $n \times n$ matrix generated according to the random
orthogonal model is bounded by $O(\bar{r}/r)$ with probability $1 - o(1)$, where $\bar{r} = \max\{r, \log n\}$.
Using Theorem 1, we see that such a matrix is column s-stable for some $s = \Omega(n/\bar{r})$ with
probability $1 - o(1)$. This should be contrasted with the conclusion of Theorem 3, which is much
stronger.



3 The Randomized Basis Pursuit (RBP) Algorithm 

Let us now consider the algorithmic aspects of matrix reconstruction, particularly those that 
are related to the reconstruction of low-rank high-stable matrices. As briefly discussed in the 
Introduction, if a reconstruction algorithm can only inspect a small number of entries, then 
it should somehow inspect those that contain the most information. Of course, since there 
is no a priori information on the input matrix, every algorithm must at some point make a 
guess at which entries are important. Currently, the best algorithms for the reconstruction 
problem all pursue an entry-wise uniform sampling strategy [5], [22], [6]. Specifically, they all 
begin by sampling a uniformly random subset of the entries and inspecting the values in those 
entries. Such a strategy will certainly perform well when the information that is crucial to the 
reconstruction is well-spread, but could also fail miserably when that information is highly
concentrated. As an illustration, consider the rank one $m \times n$ matrix A from Example 1, which
has the form:

$$A = \begin{bmatrix} a^T \\ 0 \end{bmatrix} \in \mathbb{R}^{m \times n}$$

where $a \in \mathbb{R}^n$ has no zero component. Clearly, there is no hope of reconstructing A if we do not
inspect all the entries in its first row. However, if the entry-wise uniform sampling strategy is
used, then the probability that l randomly sampled entries of A will include all the entries in
the first row is bounded above by:

$$\left( \frac{l}{mn} \right)^n$$

In particular, no algorithm that uses the entry-wise uniform sampling strategy will be able to
reconstruct A with probability larger than $e^{-1}$ even after sampling $l = mn - m = \Theta(mn)$ of its
entries!
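
To make the failure probability concrete, the following Python sketch computes the exact chance that l entries sampled uniformly without replacement cover the whole first row, and compares it with the elementary bound $(l/mn)^n$ and with $e^{-1}$. We assume sampling without replacement (a uniformly random subset of cardinality l), in which case the exact probability is $\binom{mn-n}{l-n} / \binom{mn}{l} = \prod_{i=0}^{n-1} \frac{l-i}{mn-i}$; the function name and the choice m = n = 10 are ours.

```python
import math
from fractions import Fraction


def prob_first_row_covered(m, n, l):
    """Exact probability that l distinct uniform samples include the whole first row."""
    if not 0 <= l <= m * n:
        raise ValueError("need 0 <= l <= m * n")
    if l < n:
        return Fraction(0)
    # The n first-row entries are forced; choose the other l - n freely.
    return Fraction(math.comb(m * n - n, l - n), math.comb(m * n, l))


m = n = 10
l = m * n - m                        # inspect all but m of the entries
exact = prob_first_row_covered(m, n, l)
bound = Fraction(l, m * n) ** n      # (l / mn)^n = (1 - 1/n)^n when l = mn - m

assert exact <= bound
assert float(bound) <= math.exp(-1)
print(float(exact), float(bound), math.exp(-1))
```

Even with 90 of the 100 entries revealed, the first row is fully covered only about a third of the time, whereas inspecting one column and one row (m + n - 1 entries) always suffices for this matrix.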

The above example shows that the entry-wise uniform sampling strategy may miss the 
critical structure of a matrix if that structure is localized. On the other hand, observe that the 
matrix A in the above example can be exactly reconstructed once we inspect its first row and 
any of its columns. In general, one may think of a low-rank matrix as being largely determined 
by a small number of its rows and columns. Such an intuition motivates the following matrix 
reconstruction algorithm. Note that the algorithm requires the knowledge of the rank of the
input matrix. However, as we shall see in Section 3.3, such an assumption can be removed if we
can bound the stability of the input matrix.

Randomized Basis Pursuit (RBP) Algorithm
Input: A rank $r$ $m \times n$ matrix $A$, where $r$ is known.

1. Initialization: Initialize $S \leftarrow \emptyset$ and $T \leftarrow \{1, \ldots, n\}$. The set $S$ will be used to store the
column indices that correspond to the recovered basis columns of $A$.

2. Basis Pursuit Step:

(a) If $T = \emptyset$, then stop. All the columns of $A$ have been examined, and hence $A$ can be
reconstructed directly.

(b) Otherwise, let $j$ be drawn from $T$ uniformly at random, and let $u_j \in \mathbb{R}^m$ be the
corresponding column of $A$. Examine all the entries in $u_j$. If $u_j$ is spanned by the
columns whose indices belong to $S$, then repeat Step 2b. Otherwise, update:

\[ S \leftarrow S \cup \{j\}, \quad T \leftarrow T \setminus \{j\} \]

since $u_j$ is a new basis column. Now, if $|S| = r$, then proceed to Step 3. Otherwise,
repeat Step 2.

3. Row Identification: Let $A_S$ be the $m \times r$ sub-matrix of $A$ whose columns are those indexed
by $S$. Find $r$ linearly independent rows in $A_S$. Let $\bar{S}$ be the corresponding set of row
indices, and let $A_{\bar{S}S}$ be the corresponding $r \times r$ matrix.

4. Reconstruction: Examine all the entries in the $i$-th row of $A$ for all $i \in \bar{S}$. Now, the $j$-th
column of $A$ (where $j \notin S$) can be expressed as a linear combination of the basis columns
indexed by $S$, where the coefficients are obtained by expressing the vector $(a_{ij})_{i \in \bar{S}} \in \mathbb{R}^{|\bar{S}|}$
as a linear combination of the columns of $A_{\bar{S}S}$.
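
The four steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: we use exact rational arithmetic (`fractions.Fraction`) so that the span tests are exact, we assume that `r` is the true rank of the input, and the helper names `solve_in_span` and `rbp`, as well as the greedy row selection in Step 3, are our own choices. The matrix `A` is passed in full only for convenience; the sketch reads it solely through the columns and rows that the algorithm examines, and counts the distinct entries read.

```python
import random
from fractions import Fraction

def solve_in_span(vectors, v):
    """Return c with sum_t c[t] * vectors[t] == v, or None if v is not in the span.
    `vectors` must be linearly independent; arithmetic is exact (Fractions)."""
    q, dim = len(vectors), len(v)
    # Augmented system [vectors | v], one row per coordinate.
    rows = [[Fraction(w[i]) for w in vectors] + [Fraction(v[i])] for i in range(dim)]
    for c in range(q):
        p = next((i for i in range(c, dim) if rows[i][c] != 0), None)
        if p is None:
            raise ValueError("vectors must be linearly independent")
        rows[c], rows[p] = rows[p], rows[c]
        piv = rows[c][c]
        rows[c] = [x / piv for x in rows[c]]
        for i in range(dim):
            if i != c and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [x - f * y for x, y in zip(rows[i], rows[c])]
    if any(rows[i][q] != 0 for i in range(q, dim)):
        return None  # inconsistent system: v lies outside the span
    return [rows[t][q] for t in range(q)]

def rbp(A, r, seed=0):
    """Sketch of the RBP steps; A is the hidden m x n matrix (list of rows) and
    r its true rank. Returns (reconstruction, number of distinct entries read)."""
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    inspected = set()

    def read_column(j):
        inspected.update((i, j) for i in range(m))
        return [A[i][j] for i in range(m)]

    def read_row(i):
        inspected.update((i, j) for j in range(n))
        return [A[i][j] for j in range(n)]

    # Steps 1-2: draw columns from T until r independent ones are found.
    S, T, basis = [], list(range(n)), []
    while T and len(S) < r:
        j = rng.choice(T)
        u = read_column(j)
        if solve_in_span(basis, u) is None:  # u is a new basis column
            S.append(j)
            T.remove(j)
            basis.append(u)

    # Step 3: greedily pick |S| linearly independent rows of A_S.
    R, picked = [], []
    for i in range(m):
        row = [b[i] for b in basis]
        if len(R) < len(S) and solve_in_span(picked, row) is None:
            R.append(i)
            picked.append(row)

    # Step 4: read the rows in R, then rebuild every column from the basis.
    rows_read = {i: read_row(i) for i in R}
    core = [[b[i] for i in R] for b in basis]  # columns of the r x r sub-matrix
    rebuilt = [[Fraction(0)] * n for _ in range(m)]
    for j in range(n):
        coeffs = solve_in_span(core, [rows_read[i][j] for i in R])
        for i in range(m):
            rebuilt[i][j] = sum((c * b[i] for c, b in zip(coeffs, basis)), Fraction(0))
    return rebuilt, len(inspected)

# Rank one example: first row a^T, all other rows zero (m = 4, n = 5).
A = [[1, 2, 3, 4, 5]] + [[0] * 5 for _ in range(3)]
rebuilt, count = rbp(A, r=1)
print(rebuilt == A, count)
```

As in the version of the algorithm presented above, the index of a column that turns out to be spanned by the current basis is left in `T`, so the same column may be drawn again; the set `inspected` ensures that repeated draws are not counted twice.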

It is not hard to show that when the above algorithm terminates, it will produce an exact 
reconstruction of the input matrix. To illustrate the flow of the algorithm, let us consider again 
the rank one matrix $A$ from Example 1. Since the first row of $A$ has no zero component,
any column selected in Step 2b of the algorithm can be the basis column. Now, suppose that
$j$ is the index of the selected column. After inspecting all the entries in the $j$-th column, the
algorithm will identify the $1 \times 1$ sub-matrix $A_{1j}$ in Step 3, since $A_{1j}$ is the only non-zero entry
in the $j$-th column. Consequently, the algorithm will examine all the entries in the first row
of $A$ in Step 4, thus obtaining all the information that is necessary for the reconstruction of $A$.
Note that in this example, the total number of entries inspected by the algorithm is $m + n - 1$,
which is exactly equal to the information-theoretic minimum.

From the description of the RBP algorithm, we see that if the input matrix is of low rank but 
has many candidate basis columns, then the basis pursuit step will terminate sooner, and hence 
the number of entries inspected by the algorithm will also be lower. This is indeed the case when 
the input matrix has high stability (recall that the matrix from Example 1 is $(n-1)$-stable).
Before we proceed with a formal analysis, let us remark that some additional speed up of the 
above algorithm is possible. For instance, in Step 2b, once we determine that a column lies 
in the span of those indexed by S, then we do not need to consider it anymore and hence its 
index can be removed from T. However, in order to facilitate the analysis, we shall focus on the 
version presented above. 

3.1 Sampling Complexity of the RBP Algorithm 

In this section we analyze the sampling complexity of the RBP algorithm. Specifically, our goal 
is to prove the following: 

Theorem 4 Suppose that the input rank $r$ $m \times n$ matrix $A$ to the RBP algorithm is $k$-stable for
some $k \in \{0, 1, \ldots, n-r\}$, i.e. $A \in \mathcal{M}_{m \times n}(k, r)$. Let $\delta \in (0, 1)$ be given. Then, with probability
at least $1 - r\delta$, the RBP algorithm will terminate with an exact reconstruction of $A$, and the total
number of entries inspected by the algorithm is bounded above by $nr + (k+1)^{-1} mnr (1 + \ln(1/\delta))$.

The following simple estimate will be used in the proof of Theorem 4:

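Proposition 5 Let $X$ be a geometric random variable with parameter $p \in (0, 1)$. Then, for any
$\delta > 0$, we have:

\[ \Pr\left( X > \frac{1+\delta}{p} \right) \le e^{-\delta} \]

Proof We compute:

\[
\begin{aligned}
\Pr\left( X > \frac{1+\delta}{p} \right)
&\le \sum_{j = \lceil (1+\delta)/p \rceil}^{\infty} p (1-p)^{j-1} \\
&= p \cdot (1-p)^{\lceil (1+\delta)/p \rceil - 1} \cdot \sum_{j=0}^{\infty} (1-p)^j \\
&= (1-p)^{\lceil (1+\delta)/p \rceil - 1} \\
&\le e^{-\delta}
\end{aligned}
\]

where the last inequality follows from $1 - p \le e^{-p}$ and $p < 1$. This completes the proof. □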
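
The tail estimate can also be checked numerically. The sketch below assumes that the geometric random variable $X$ takes values in $\{1, 2, \ldots\}$, so that $\Pr(X > t) = (1-p)^{\lfloor t \rfloor}$, and verifies the inequality $\Pr(X > (1+\delta)/p) \le e^{-\delta}$ on a small grid. The function names and the grid values are ours and are chosen only for illustration.

```python
from math import exp, floor

def geometric_tail(p: float, t: float) -> float:
    """Pr(X > t) for X geometric on {1, 2, ...} with success probability p."""
    return (1.0 - p) ** max(floor(t), 0)

def check_tail_bound(p: float, delta: float) -> bool:
    """Check Pr(X > (1 + delta) / p) <= exp(-delta) for one (p, delta) pair."""
    return geometric_tail(p, (1.0 + delta) / p) <= exp(-delta)

grid_p = [0.01, 0.1, 0.3, 0.5, 0.9, 0.99]
grid_delta = [0.01, 0.5, 1.0, 3.0, 10.0]
all_ok = all(check_tail_bound(p, d) for p in grid_p for d in grid_delta)
print(all_ok)
```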

Proof of Theorem 4 Observe that Step 2 of the RBP algorithm is the only place where
randomization is used, and that once Step 2 is completed, the algorithm will always terminate
with an exact reconstruction of $A$. Thus, it suffices to obtain a high probability bound on
the number of times Step 2b is executed throughout the entire course of the algorithm.
Towards that end, let us divide the execution of Step 2 into epochs, where the $i$-th epoch (for
$i = 0, 1, \ldots, r-1$) begins at the iteration where $|S| = i$ for the first time and ends at the iteration
just before the one where $|S| = i + 1$. Let $p_i$ be the probability that the column selected in an
iteration of the $i$-th epoch is a basis column. Note that $p_i$ is a random variable that depends
on which $i$ basis columns are selected in the previous $i$ epochs. However, since the input matrix
is assumed to be $k$-stable, we have:

\[ p_i \ge \frac{k+1}{n-i} \quad \text{for } i = 0, 1, \ldots, r-1 \]

Now, let $Y_i$ be the number of times Step 2b is executed in the $i$-th epoch. Then, $Y_i$
is a geometric random variable with parameter $p_i$, and the number of times Step 2b is
executed throughout the entire course of the algorithm is given by:

\[ Y = \sum_{i=0}^{r-1} Y_i \]

By Proposition 5, we have:

\[ \Pr\left( Y_i > \frac{n-i}{k+1} \left( 1 + \ln\frac{1}{\delta} \right) \right) \le \delta \quad \text{for } i = 0, 1, \ldots, r-1 \]
It follows that with probability at least $1 - r\delta$, the total number of times Step 2b is
executed is bounded above by:

\[ \sum_{i=0}^{r-1} \frac{n-i}{k+1} \left( 1 + \ln\frac{1}{\delta} \right) = \frac{r(2n-r+1)}{2(k+1)} \left( 1 + \ln\frac{1}{\delta} \right) \le \frac{nr}{k+1} \left( 1 + \ln\frac{1}{\delta} \right) \]

Note that the above quantity is also an upper bound on the number of distinct columns examined
by the algorithm. Thus, we see that with probability at least $1 - r\delta$, the total number of entries
inspected by the algorithm is bounded above by:

\[ nr + \frac{mnr}{k+1} \left( 1 + \ln\frac{1}{\delta} \right) \]

and the proof is completed. □
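
To get a feeling for how the bound of Theorem 4 depends on the stability parameter $k$, one may evaluate it numerically. The following Python sketch does so; the function names and the sample values ($m = n = 1000$, $r = 5$, $\delta = 10^{-3}$) are illustrative assumptions, and the guarantee itself holds only with probability at least $1 - r\delta$.

```python
from math import log

def rbp_draw_bound(n, r, k, delta):
    """Bound nr(1 + ln(1/delta)) / (k+1) on the number of executions of Step 2b."""
    return n * r * (1.0 + log(1.0 / delta)) / (k + 1)

def rbp_draw_bound_sharper(n, r, k, delta):
    """Intermediate bound r(2n - r + 1)(1 + ln(1/delta)) / (2(k+1)) from the proof."""
    return r * (2 * n - r + 1) * (1.0 + log(1.0 / delta)) / (2 * (k + 1))

def rbp_entry_bound(m, n, r, k, delta):
    """Bound nr + mnr(1 + ln(1/delta)) / (k+1) on the number of inspected entries."""
    return n * r + m * rbp_draw_bound(n, r, k, delta)

m = n = 1000
r, delta = 5, 1e-3
for k in (0, 99, n - r):
    # Compare the bound with the total number of entries mn.
    print(k, round(rbp_entry_bound(m, n, r, k, delta)), m * n)
```

For $k = 0$ the bound exceeds $mn$ and is therefore vacuous, whereas for $k = n - r$ it is a small fraction of $mn$.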

Upon combining the results of Theorem 3 and Theorem 4, we obtain the following corollary,
which significantly improves the result in [6]:

Corollary 1 Let $A$ be a rank $r$ $m \times n$ matrix generated according to the random orthogonal
model. Then, with probability at least $1 - O(n^{-3})$, the RBP algorithm will terminate with an
exact reconstruction of $A$, and the total number of entries inspected by the algorithm is bounded
above by $O(n + m \log n)$ when $r = O(1)$, and by $O(n \log n + m \log^2 n)$ when $r = O(\log n)$.



3.2 Implementation and Complexity Analysis of the RBP Algorithm 

In this section we discuss some of the implementation details of the RBP algorithm and analyze 
its computational complexity. Clearly, the initialization step can be done using $O(n)$ operations.
For the basis pursuit step, we need to determine whether a newly selected column is in the
span of the current basis columns (i.e. those indexed by $S$). This can be achieved via a
Gram-Schmidt type process. Specifically, we maintain a set $U$ of orthonormal vectors with the following
property: during the $i$-th epoch (where $i = 0, 1, \ldots, r-1$), the set $U$ will contain $i$ orthonormal
vectors $w_1, \ldots, w_i \in \mathbb{R}^m$, whose span is equal to that of the columns indexed by $S$. Now, suppose
that the algorithm selects the column $v \in \mathbb{R}^m$. To test whether $v \in \mathrm{span}\{w_1, \ldots, w_i\}$, we first
compute:

\[ \Pi_i(v) \equiv \sum_{l=1}^{i} (w_l^T v) w_l \in \mathbb{R}^m \]

and then check whether $\Pi_i(v) = v$ (we set $\Pi_0(v) = 0$). If this is the case, then we have
$v \in \mathrm{span}\{w_1, \ldots, w_i\}$, whence we can proceed to select another column. Otherwise, $v$ is a new
basis column. Thus, we add its index to the set $S$ and add the unit vector $(v - \Pi_i(v)) / \|v - \Pi_i(v)\|_2$
to the set $U$ before continuing to the next instruction.
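
The span test just described can be sketched in Python as follows. Since floating-point arithmetic cannot decide the exact equality $\Pi_i(v) = v$, the sketch replaces it by a relative tolerance `tol`; this tolerance, as well as the function names `project` and `try_extend`, are assumptions of ours and are not part of the analysis.

```python
from math import sqrt

def project(basis, v):
    """Return Pi(v) = sum_l (w_l^T v) w_l for a list `basis` of orthonormal vectors."""
    proj = [0.0] * len(v)
    for w in basis:
        coeff = sum(wi * vi for wi, vi in zip(w, v))
        for t in range(len(v)):
            proj[t] += coeff * w[t]
    return proj

def try_extend(basis, v, tol=1e-9):
    """If v lies outside span(basis), append the normalized residual to `basis`
    and return True; otherwise return False. The relative tolerance `tol` is an
    assumed stand-in for the exact test Pi(v) = v."""
    proj = project(basis, v)
    residual = [vi - pi for vi, pi in zip(v, proj)]
    res_norm = sqrt(sum(x * x for x in residual))
    v_norm = sqrt(sum(x * x for x in v))
    if res_norm <= tol * max(v_norm, 1.0):
        return False
    basis.append([x / res_norm for x in residual])
    return True

basis = []
cols = [[1.0, 0.0, 1.0], [2.0, 0.0, 2.0], [0.0, 1.0, 1.0], [1.0, 1.0, 2.0]]
flags = [try_extend(basis, c) for c in cols]
print(flags, len(basis))
```

Each call to `try_extend` with $i$ vectors in `basis` performs $O(im)$ arithmetic operations, in line with the operation count used below.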

To determine the time needed by the basis pursuit step, observe that during the $i$-th epoch,
each selected column requires $O(im)$ operations. Since $i \le r-1$, by Theorem 4 we conclude
that with probability at least $1 - r\delta$, the total number of operations executed in the basis pursuit step
is bounded by $O((k+1)^{-1} mnr^2 \log(1/\delta))$.

In the row identification step, we need to find $r$ linearly independent rows in the $m \times r$
matrix $A_S$. This can be achieved by a Gram-Schmidt type process similar to the one outlined
above, and the total number of operations required is bounded by $O(mr^2)$.

Finally, in order to carry out the reconstruction step, we can first compute the inverse of the
non-singular $r \times r$ matrix $A_{\bar{S}S}$, whose rows are indexed by the set $\bar{S}$ found in the row
identification step, using $O(r^3)$ operations. Then, for each $j \notin S$, we can express
the vector $(a_{ij})_{i \in \bar{S}} \in \mathbb{R}^{|\bar{S}|}$ as a linear combination of the columns of $A_{\bar{S}S}$ using $O(r^2)$ operations.
Afterwards, the $j$-th column can be reconstructed using $O(mr)$ operations. Since we need to
reconstruct at most $n - 1$ columns, it follows that the total number of operations required in
the reconstruction step is bounded by $O(r^3 + nr^2 + mnr) = O(mnr)$.

To summarize, we have the following: 

Theorem 5 Suppose that the input rank $r$ $m \times n$ matrix $A$ to the RBP algorithm is $k$-stable for
some $k \in \{0, 1, \ldots, n-r\}$, i.e. $A \in \mathcal{M}_{m \times n}(k, r)$. Let $\delta \in (0, 1)$ be given. Then, with probability
at least $1 - r\delta$, the total number of operations performed by the RBP algorithm is bounded by
$O((k+1)^{-1} mnr^2 \log(1/\delta) + mnr)$.

Note that Theorem 5 only gives a probabilistic bound on the runtime. However, the bound
can be made deterministic by suitably modifying the RBP algorithm. Specifically, we can add
a counter to keep track of the total number of times Step 2b is executed. Once that
number exceeds $(k+1)^{-1} nr (1 + \ln(1/\delta))$, we stop the algorithm and declare failure. With such
a modification, the conclusion of Theorem 4 still holds. However, we now have a deterministic
bound of $O((k+1)^{-1} mnr^2 \log(1/\delta) + mnr)$ on the runtime. We remark that such an idea can
also be used to develop a "rank-free" version of the RBP algorithm, i.e. one that does not
require the knowledge of the rank of the input matrix. We refer the reader to Section 3.3 for
details.

The time bound obtained in Theorem 5 compares very favorably with that for the SDP-based
algorithm of Candes and Recht [5]. Perhaps more importantly, our algorithm will produce
an exact reconstruction of the input matrix in polynomial time. By contrast, the Candes-Recht 
algorithm can only produce an approximate reconstruction in polynomial time. This is due to 
the fact that SDPs can only be solved to a fixed level of accuracy in polynomial time. We refer 
the reader to [30] for further discussion on this issue. 

3.3 A Rank-Free RBP Algorithm 

Recall that the RBP algorithm introduced earlier assumes that the rank of the input matrix
is known. However, in practice, there is very little a priori information on the input matrix.
This raises the question of whether one can design a reconstruction algorithm that does not
need the rank information. It turns out that this is possible if we modify the RBP algorithm
using the idea mentioned at the end of the last sub-section. Specifically, we keep track of the
number of attempts made by the algorithm to find the next basis column. If that number
reaches a pre-specified threshold, say $\Delta$, then we exit the basis pursuit step and proceed to
the row identification step of the algorithm. The idea is that if $\Delta$ is sufficiently large and the
algorithm fails to find a new basis column after $\Delta$ drawings, then it probably has found all the
basis columns and hence the input matrix can be exactly reconstructed. To formalize this idea,
let us first give a precise description of the proposed algorithm.

Rank-Free Randomized Basis Pursuit (RF-RBP) Algorithm
Input: An $m \times n$ matrix $A$, stopping threshold $\Delta \ge 1$.

1. Initialization: Initialize $S \leftarrow \emptyset$, $T \leftarrow \{1, \ldots, n\}$ and $k \leftarrow 0$. The set $S$ will be used to store
the column indices that correspond to the recovered basis columns of $A$. The counter $k$
will be used to keep track of the number of attempts made to find the next basis column.

2. Basis Pursuit Step:

(a) If $T = \emptyset$, then stop. All the columns of $A$ have been examined, and hence $A$ can be
reconstructed directly. Otherwise, reset $k \leftarrow 0$ and proceed to Step 2b.

(b) Let $j$ be drawn from $T$ uniformly at random, and let $u_j \in \mathbb{R}^m$ be the corresponding
column of $A$. Examine all the entries in $u_j$.

If $u_j$ is spanned by the columns whose indices belong to $S$, then increment $k \leftarrow k + 1$.
If $k \ge \Delta$, then proceed to Step 3. Otherwise, repeat Step 2b.

If $u_j$ is not spanned by the columns whose indices belong to $S$, then $u_j$ is a new basis
column. Update:

\[ S \leftarrow S \cup \{j\}, \quad T \leftarrow T \setminus \{j\} \]

and repeat Step 2.

3. Row Identification: Let $A_S$ be the $m \times |S|$ sub-matrix of $A$ whose columns are those
indexed by $S$. Find $|S|$ linearly independent rows in $A_S$. Let $\bar{S}$ be the corresponding set
of row indices, and let $A_{\bar{S}S}$ be the corresponding $|S| \times |S|$ matrix.

4. Reconstruction: Examine all the entries in the $i$-th row of $A$ for all $i \in \bar{S}$. Now, express
the $j$-th column of $A$ (where $j \notin S$) as a linear combination of the basis columns indexed
by $S$, where the coefficients are obtained by expressing the vector $(a_{ij})_{i \in \bar{S}} \in \mathbb{R}^{|\bar{S}|}$ as a linear
combination of the columns of $A_{\bar{S}S}$.
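
The only part of the RF-RBP algorithm that differs from the RBP algorithm is the basis pursuit step with its attempt counter. A minimal Python sketch of that step is given below; Steps 3 and 4 are omitted. We again assume exact rational arithmetic so that the span test is exact, and the names `reduce_against` and `rf_basis_pursuit`, the incremental elimination used for the span test, and the variable `attempts` (which plays the role of the counter) are our own illustrative choices.

```python
import random
from fractions import Fraction

def reduce_against(echelon, v):
    """Eliminate v against the stored (pivot index, vector) pairs.
    The result is the zero vector exactly when v lies in their span."""
    v = [Fraction(x) for x in v]
    for pivot, w in echelon:
        if v[pivot] != 0:
            f = v[pivot] / w[pivot]
            v = [a - f * b for a, b in zip(v, w)]
    return v

def rf_basis_pursuit(columns, threshold, seed=0):
    """Step 2 of RF-RBP. `columns` lists the n columns of A and `threshold`
    is the stopping threshold (at least 1). Returns (S, draws): the indices
    of the basis columns found and the total number of columns drawn."""
    rng = random.Random(seed)
    S, T, echelon, draws = [], list(range(len(columns))), [], 0
    while T:                              # Step 2(a): stop once T is empty
        attempts = 0                      # reset the attempt counter
        while True:                       # Step 2(b)
            j = rng.choice(T)
            draws += 1
            residual = reduce_against(echelon, columns[j])
            pivot = next((i for i, x in enumerate(residual) if x != 0), None)
            if pivot is not None:         # column j is a new basis column
                S.append(j)
                T.remove(j)
                echelon.append((pivot, residual))
                break
            attempts += 1
            if attempts >= threshold:     # give up and move on to Step 3
                return S, draws
    return S, draws

# Rank one example with n = 5 columns (a_j, 0, 0)^T and threshold 3.
rank_one_cols = [[a, 0, 0] for a in (1, 2, 3, 4, 5)]
S, draws = rf_basis_pursuit(rank_one_cols, threshold=3)
print(S, draws)
```

In the rank one example the first draw always yields a basis column and the next three draws all fail, so the step ends after four draws regardless of the seed.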

Again, we are interested in the sampling complexity of the RF-RBP algorithm. It turns out
that if the input matrix is known to be $k$-stable for some $k \ge 0$, then the sampling complexity
of the RF-RBP algorithm is comparable to that of the RBP algorithm. Specifically, we prove
the following:

Theorem 6 Suppose that the input $m \times n$ matrix $A$ to the RF-RBP algorithm is $k$-stable for
some $k \ge 0$, i.e. $A \in \mathcal{M}_{m \times n}(k)$. Let $\delta \in (0, 1)$ be given, and set:

\[ \Delta = \log\left( \frac{\delta}{\min\{m, n\}} \right) \Big/ \log\left( 1 - \frac{k+1}{n} \right) \tag{4} \]

Then, with probability at least $1 - \delta$, the RF-RBP algorithm will terminate with an exact
reconstruction of $A$, and the total number of entries inspected by the algorithm is bounded above by
$nr + m(r+1)\Delta$, where $r = \mathrm{rank}(A)$.

Remarks

1. Since $\log(1 - (k+1)/n) \le -(k+1)/n$, Theorem 6 guarantees that the total number of
entries inspected by the RF-RBP algorithm is bounded by:

\[ nr + \frac{mn(r+1)}{k+1} \log\left( \frac{\min\{m, n\}}{\delta} \right) \]

In particular, when $\delta$ is inversely proportional to a polynomial in $\min\{m, n\}$, the bound
above is of the same order as that obtained for the RBP algorithm (see Theorem 4).

2. To appreciate the power of Theorem 6, consider an $m \times n$ matrix $A$ whose rank $r$ is known
to be much smaller than $\min\{m, n\}$, say, $r < n/2$. If $A$ is generic, then by Theorem
2 it is $k$-stable, where $k = n - r > n/2$. Hence, by Theorem 6, the matrix $A$ can be
exactly reconstructed by the RF-RBP algorithm with high probability, and the number
of inspected entries is bounded by $O(nr + mr \log n)$. Note that such a reconstruction is
done without the algorithm knowing the exact value of $r$ or $k$. By contrast, the algorithm
of Keshavan et al. [22] is much less flexible, as it needs to know the exact value of $r$ in
order to guarantee an exact reconstruction.
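
The quantities appearing in Theorem 6 and in the first remark are easy to evaluate numerically. The sketch below reads the stopping threshold as $\Delta = \log(\delta/\min\{m,n\}) / \log(1 - (k+1)/n)$ and compares the bound $nr + m(r+1)\Delta$ with the relaxed bound $nr + (k+1)^{-1} mn(r+1) \log(\min\{m,n\}/\delta)$. The function names, the sample values, and the restriction to $k \le n - 2$ (which avoids the degenerate case $\log 0$) are assumptions made for this illustration.

```python
from math import log

def stopping_threshold(m, n, k, delta):
    """Delta = log(delta / min{m, n}) / log(1 - (k + 1) / n)."""
    if not (0 <= k < n - 1 and 0.0 < delta < 1.0):
        raise ValueError("sketch assumes 0 <= k <= n - 2 and 0 < delta < 1")
    return log(delta / min(m, n)) / log(1.0 - (k + 1) / n)

def rf_rbp_entry_bound(m, n, r, k, delta):
    """Bound nr + m(r + 1) * Delta on the number of inspected entries."""
    return n * r + m * (r + 1) * stopping_threshold(m, n, k, delta)

def relaxed_entry_bound(m, n, r, k, delta):
    """Relaxed bound nr + mn(r + 1) / (k + 1) * log(min{m, n} / delta)."""
    return n * r + m * n * (r + 1) / (k + 1) * log(min(m, n) / delta)

m = n = 1000
r, delta = 5, 0.01
k = n - r  # stability of a generic rank r matrix, as in the second remark
print(stopping_threshold(m, n, k, delta))
print(rf_rbp_entry_bound(m, n, r, k, delta), relaxed_entry_bound(m, n, r, k, delta))
```

Since the counter in the algorithm is an integer, an implementation would in effect use $\lceil \Delta \rceil$ attempts per epoch; the sketch reports the real-valued threshold.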

Proof of Theorem 6 For $j = 1, 2, \ldots, r$, let $q_j$ be the probability that the RF-RBP algorithm
finds at least $j$ basis columns before proceeding to Step 3. We claim that:

\[ q_j \ge \prod_{i=1}^{j} \left[ 1 - \left( 1 - \frac{k+1}{n-i+1} \right)^{\Delta} \right] \quad \text{for } j = 1, \ldots, r \tag{5} \]

The proof of (5) will proceed by induction on $j$. To facilitate the proof, let us again divide the
execution of Step 2 into epochs, where the $(j-1)$-st epoch (for $j = 1, 2, \ldots, r$) is defined in
exactly the same way as in the proof of Theorem 4. Furthermore, let $p_j$ be the probability that
the column selected in an iteration of the $(j-1)$-st epoch is a basis column. Since $A$ is assumed
to be $k$-stable, we have:

\[ p_j \ge \frac{k+1}{n-j+1} \quad \text{for } j = 1, 2, \ldots, r \]

Now, for $j = 1$, we have:

\[ q_1 = 1 - (1 - p_1)^{\Delta} \ge 1 - \left( 1 - \frac{k+1}{n} \right)^{\Delta} \]

and hence the base case holds. Suppose that (5) holds for some $j < r$. Then, conditioned on the
event that the RF-RBP algorithm finds at least $j$ basis columns before proceeding to Step 3,
the probability that the RF-RBP algorithm finds at least $j + 1$ basis columns before proceeding
to Step 3 is given by:

\[ q_{j+1}^{\mathrm{cond}} = 1 - (1 - p_{j+1})^{\Delta} \ge 1 - \left( 1 - \frac{k+1}{n-j} \right)^{\Delta} \]

Hence, it follows from the definition of conditional probability and the inductive hypothesis
that:

\[ q_{j+1} = q_{j+1}^{\mathrm{cond}} \cdot q_j \ge \prod_{i=1}^{j+1} \left[ 1 - \left( 1 - \frac{k+1}{n-i+1} \right)^{\Delta} \right] \]

This completes the proof of (5).

Now, observe that the RF-RBP algorithm will terminate with an exact reconstruction of $A$
iff it finds $r$ basis columns before proceeding to Step 3. Using (5) and the definition of $\Delta$ in
(4), we see that the probability of such an event is at least:

\[ \prod_{i=1}^{r} \left[ 1 - \left( 1 - \frac{k+1}{n-i+1} \right)^{\Delta} \right] \ge \left[ 1 - \left( 1 - \frac{k+1}{n} \right)^{\Delta} \right]^{r} \ge \left( 1 - \frac{\delta}{\min\{m, n\}} \right)^{\min\{m, n\}} \ge 1 - \delta \]

Moreover, the number of distinct columns inspected by the algorithm is always bounded above
by $(r+1)\Delta$, which implies that the total number of entries inspected by the algorithm is bounded
above by $nr + m(r+1)\Delta$. This completes the proof of Theorem 6. □

Finally, upon following the proof of Theorem 5, one can easily establish the following complexity
result for the RF-RBP algorithm:

Theorem 7 Given an $m \times n$ matrix $A$ and a stopping threshold $\Delta \ge 1$, the total number of
operations performed by the RF-RBP algorithm before it terminates is bounded by $O(mr^2\Delta + mnr)$,
where $r = \mathrm{rank}(A)$.

We remark that the bound in Theorem 7 holds for arbitrary input matrices. In the case where
the input matrix has rank $r$ and is $k$-stable, we can set $\Delta$ as in (4) and bound the total number
of operations by:

\[ O\left( \frac{mnr^2}{k+1} \log\left( \frac{\min\{m, n\}}{\delta} \right) + mnr \right) \]

In particular, when $\delta$ is inversely proportional to a polynomial in $\min\{m, n\}$, the bound above
is of the same order as that obtained for the RBP algorithm (see Theorem 5).
4 Conclusion

In this paper we proposed a randomized basis pursuit (RBP) algorithm for the matrix
reconstruction problem. We introduced the notion of a $k$-stable matrix and showed that the RBP
algorithm can reconstruct a $k$-stable rank $r$ $n \times n$ matrix with high probability after inspecting
$O((k+1)^{-1} n^2 r \log n)$ of its entries. In addition, we showed that the runtime of the RBP
algorithm is bounded by $O((k+1)^{-1} n^2 r^2 \log n + n^2 r)$. Our results yield substantial improvement
over those in existing literature ([5], [22], [6]), in the sense that the RBP algorithm can reconstruct
a larger class of matrices by inspecting a smaller number of entries, and it can do so in a more 
efficient manner. Although the RBP algorithm assumes that the rank of the input matrix is 
known, we showed that such an assumption can be removed. Specifically, we proposed a variant 
of the RBP algorithm that can reconstruct a matrix without knowing the exact value of its rank. 
Such a feature offers great flexibility in practical settings. Finally, it would be interesting to 
study the tradeoff between the sampling complexity and computational complexity of the ma- 
trix reconstruction problem. Another interesting direction would be to extend our techniques to 
handle the case where the sampled entries are noisy. Some recent results along this direction, 
which are established using the techniques of [SI E] , can be found in [4] . 